An efficient algorithm for finding short approximate non-tandem repeats
نویسندگان
چکیده
منابع مشابه
An efficient algorithm for finding short approximate non-tandem repeats
We study the problem of approximate non-tandem repeat extraction. Given a long subject string S of length N over a finite alphabet Sigma and a threshold D, we would like to find all short substrings of S of length P that repeat with at most D differences, i.e., insertions, deletions, and mismatches. We give a careful theoretical characterization of the set of seeds (i.e., some maximal exact rep...
متن کاملAn Algorithm for Approximate Tandem Repeats
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g., abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g., abcdaacd. In this paper we consider two criterions of similarity: the Hamming distance (k mismatches) and the edit distance (k differences). For a string S of lengt...
متن کاملSTAR: an algorithm to Search for Tandem Approximate Repeats
MOTIVATION Tandem repeats consist in approximate and adjacent repetitions of a DNA motif. Such repeats account for large portions of eukaryotic genomes and have also been found in other life kingdoms. Owing to their polymorphism, tandem repeats have proven useful in genome cartography, forensic and population studies, etc. Nevertheless, they are not systematically detected nor annotated in geno...
متن کاملTRStalker: an Efficient Heuristic for Finding NP-Complete Tandem Repeats
Genomic sequences in higher eucaryotic organisms contain a substantial amount of (almost) repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage, are characterized by close spatial contiguity, and play an important role in several molecular regulatory mechanisms. Certain types of tandem repeats are...
متن کاملTRStalker: an efficient heuristic for finding fuzzy tandem repeats
MOTIVATION Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Bioinformatics
سال: 2001
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/17.suppl_1.s5